8 research outputs found

    Automation is Documentation: Functional Documentation of Human-Machine Interaction for Future Software Reuse

    Preserving software and providing access to obsolete software is necessary and will become even more important for work with any kind of born-digital artifact. While the usability and availability of emulation in digital curation and preservation workflows have improved significantly, productive (re)use of preserved obsolete software is a growing concern due to a lack of (future) operational knowledge. In this article we describe solutions to automate and document software usage such that the result is not only instructive but also productive.

    How Long Can We Build It? Ensuring Usability of a Scientific Code Base

    Software, and source code in particular, has become an important component of scientific publications and is therefore now subject to research data management. Maintaining source code so that it remains a usable and valuable scientific contribution is a huge task. Not all code contributions can be actively maintained forever. Eventually, there will be a significant backlog of legacy source code. In this article we analyse the requirements for applying the concept of long-term reusability to source code. We use a simple case study to identify gaps and provide an emulator-based technical infrastructure to support automated builds of historic software available as source code.

    CiTAR - Preserving Software-based Research

    In contrast to books or published articles, the purely digital output of research projects is more fragile and thus more difficult to preserve, to make available, and to reuse by a wider research community. Not only does a fast-growing diversity of formats in research data sets require additional software preservation, but today’s computer-assisted research disciplines also increasingly devote significant resources to creating new digital resources and software-based methods. In order to adapt FAIR data principles, especially to ensure re-usability of a wide variety of research outputs, novel ways of preserving software and additional digital resources are required, as well as their integration into existing research data management strategies. This article addresses the challenges and options for preserving containers and virtual machines that encapsulate software-based research methods as portable and preservable research resources, and provides a preservation plan as well as an implementation.

    CiTAR – Citing and Archiving Research

    While institutional infrastructure for collecting and conserving primary scientific data is under construction or already exists, awareness is growing of a parallel problem concerning the associated models and methods, in particular those for data evaluation. However, there is hardly any usable infrastructure or service offering yet. Although the DFG recommendations on "good scientific practice" currently only recommend the retention of primary scientific data, the remainder of the recommendation refers to mandatory records of "materials and methods" that are necessary not only for comprehensible results but also for the publication process. If scientific results are to be reproducible, for example for independent verification, the experimental setup must be reconstructed. However, in the digital age, with its extremely short life span (and availability) of hardware and software components, replicating a data processing workflow that is identical in all components cannot be achieved solely on the basis of records. CiTAR (Citing and Archiving Research), a three-year Baden-Württemberg state project, develops infrastructure to support computer-assisted research. One major outcome of this project is a means to publish, cite and provide long-term access to virtual research environments. The aim of this project is to develop a cooperative, multidisciplinary technical-organizational service to support teaching and research in the further development of "good scientific practice". The service should make data and scientific methods jointly citable and reproducible in order to meet the requirements of modern journals. CiTAR realizes re-use of research data and long-term availability as part of modern research data management.
To achieve the project objectives, three of the four bwFor HPC operators have joined forces to prototype a broader scope in the natural sciences, especially the computational and data-intensive scientific disciplines. The developed service provides automated import of virtual machines and popular container formats like Docker and Singularity. CiTAR assigns persistent identifiers to the imported research environments and provides resources to re-run the archived objects with external data.

    Enabling Portable Data-centric Scientific Software Environments - Connecting Preserved Software with Data Providers

    Today’s computer-assisted research relies heavily on appropriate infrastructure such as storage and data management services as well as (high-performance) computing infrastructure. Of at least similar importance is scientific software, often found as customized software-based setups for processing data or for creating novel (software-based) models or simulations. Hence, in order to adapt FAIR data principles to software-based research methods and to ensure re-usability of a wide variety of digital research outputs, a sustainable research management strategy requires not only preserving these software methods but also facilitating access to data associated with suitable processing software. Within CiTAR (Citing and Archiving Research), an e-Science project, we have developed infrastructure to preserve and cite software methods and to ensure scalable long-term access and re-use. The service allows researchers to ingest their configured software setup, e.g., in the form of a container or a virtual machine, and to re-run these setups without any special knowledge using a web browser or web API. While the service provides convenient APIs and web-based workflows to orchestrate their execution, provisioning of data (e.g., making a data set accessible as an HTTP data stream and, if necessary, authenticating the user) remains an open issue. As part of the newly formed science data center (SDC) BioDATEN, we have addressed this challenge by developing technology to simplify the publication of preserved software together with a published data set and, more generally, to orchestrate the reproduction of an experiment from different sources, e.g., data set, metadata and runtime data, with the main focus on vendor-neutral integration into existing infrastructure wherever possible.
Authors are then able to link a previously preserved software environment with published data, such that the software may reproduce their results from their input data, visualize data, e.g., through plots, or allow interactive exploration of data and results. The main challenge for the integration is to orchestrate the interaction between two services and infrastructures and to properly encapsulate the user interface components: the data publication platform must embed a connection to the software preservation infrastructure and prepare the research data set as input for the desired software process. In the context of the aforementioned SDC, an InvenioRDM instance is used as the web-based data publication platform and Keycloak as the OAuth 2.0 authentication and authorization provider. InvenioRDM stores data in an S3-compatible object storage but provides its own front-end APIs to access saved objects. Unfortunately, this user-facing API may change over time, such that third-party elements may break. Furthermore, the creation of rich data publications should be as simple as possible, to allow any user to create and maintain them themselves. For this, we have extended the access to preserved scientific software by wrapping it into a standard Web Component. This Web Component is a self-contained HTML element and can be embedded as a Custom Element into the data publication platform’s user interface, a process very similar to embedding a YouTube video. Like any built-in HTML element, it provides a stable interface: it takes its input data through defined attributes and accepts listeners for (lifecycle) events such as the start and end of execution. This stable interface can also be used by the embedding platform to pass OAuth 2.0 compatible access tokens from the publication platform to the preservation infrastructure.
By using Shadow DOM, it does not interfere with surrounding user interface or web page elements even if these change over time. The presented approach is not limited to bioinformatics but is designed to cater to any scientific community relying on software-based workflows and digital resources. The cloud-based approach allows other services to re-use the proposed solution as a “drop-in”, independently of their technological infrastructure.

    Preservation strategies for an internet-based artwork yesterday today and tomorrow - iPRES 2019 Amsterdam

    This paper investigates possible preservation strategies for an internet-based artwork and assesses the strategies that best capture the authenticity of the work for future iterations. Two different preservation strategies are applied to the internet-based artwork TraceNoizer.org from 2001. A third one, a Linux Live CD, was carried out by one of the artists. They are compared and evaluated from the perspective of the long-term preservation of the work’s most significant properties. Compared to software-based artworks, the characteristics of internet-based artworks shift the focus of the preservation measures from stabilizing the software to reducing server maintenance, protecting the server and artwork from internet threats, and reducing external dependencies. This paper suggests solutions for handling these challenges and discusses their benefits and disadvantages for long-term preservation.

    CiTAR – Citable Scientific Software and Software Methods

    Modern research projects not only rely on computer-assisted methods and digital resources but often also develop specific software and software-based processes. An essential pillar of science is the traceability and, where required, the reproduction or verification of published findings. For computer-assisted research this is increasingly becoming a challenge, in particular because of the complex reconstruction of software-based methods, their specific configuration and their technical dependencies. This contribution presents CiTAR (Citing and Archiving Research), a three-year project funded by the Ministry of Science and Art of the State of Baden-Württemberg that develops infrastructure to support computer-assisted research. The focus is on the methods, i.e., software-based processes or models for creating or evaluating data. The main project goal is to document these methods in a traceable and reusable way, so that they become just as citable and publishable as research data already are today.